1. Project Objectives:

This  project’s  objectives  were  to  create,  implement,  and  evaluate  distributed  computational 
mechanisms  for  automating  composition  and  scheduling  of  network-based  services  in 
applications  where  timing  is  important.  For  example,  a  commander  might  need  intelligence  to  be 
gathered,  processed,  summarized,  and  visualized,  but  with  timing  preferences,  such  as  that  the 
results  are  available  by  a  deadline  but  should  be  as  recent  as  possible  rather  than  having  gone 
stale.  Preferences  and  constraints  on  timing  impact  which  service  providers  are  selected 
(depending  on  availability),  which  services  they  will  provide  (depending  on  the  timing 
characteristics  for  different  levels  of  service  provision),  and  when  each  should  begin  and 
complete  delivery  of  its  promised  service(s).  Further  complications  arise  due  to  factors  such  as 
uncertainty  over  how  long  some  services  might  take,  competing  service  requests  for  scarce 
services,  and  inherent  distribution  of  authority  and  private  knowledge  across  service  providers 
and  requesters.  The  project’s  technical  thrusts  were  to  develop  novel  extensions  to  temporal 
reasoning,  multiagent  sequential  decision-making,  and  distributed  constraint  optimization,  which 
contribute  collectively  to  solving  such  problems. 

2.  Summary  of  Significant  Work  Accomplished 

The project made substantial contributions to the science and engineering of computational
techniques for multiagent sequential decision-making, distributed constraint optimization, and
temporal planning. Each contribution is summarized below; full descriptions are available in the
references cited.

2.1  Multiagent  Sequential  Decision-Making  for  Service  Composition  and  Coordination 

Agents  that  are  cooperatively  providing  services  to  each  other  generally  act  in  uncertain  domains, 
where  how  long  it  will  take  to  provide  a  service,  and  the  quality  of  the  result  of  a  service  provision, 
might  not  be  fully  predictable.  Conventional  methods  to  solve  such  problems  draw  on  multiagent 
sequential  decision-making  techniques  that  explicitly  coordinate  the  agents’  joint  policy  decisions. 
These  techniques  are  inherently  susceptible  to  the  curse  of  dimensionality,  as  the  agents’  state, 
action,  and  observation  spaces  grow  exponentially  with  the  number  of  agents.  This  project  has 
made  fundamental  advancements  to  solving  such  problems  by  developing  principled 
representations  and  algorithmic  techniques  that  allow  agents  to  coordinate  at  a  more  abstract 
level  of  influences.  Intuitively,  the  idea  is  that  an  agent  (such  as  an  agent  requesting  a  service) 
need  not  always  know  the  full  policy  of  an  agent  with  whom  it  is  cooperating  (such  as  an  agent 
providing  a  service),  but  instead  needs  to  know  only  how  the  other’s  policy  will  materially 
influence  its  own  plans,  such  as  when  the  service  will  actually  be  provided,  and  not  what  other 
service provision requests will be fulfilled before and afterward [WD2008, WD2009a, WD2009b,
WD2009c].

These ideas were developed in the recently completed dissertation of Stefan Witwicki [W2011].
That  work  has  derived  a  new  complexity  characterization  of  the  joint  policy  coordination  problem, 
combining  several  complementary  aspects  of  weakly-coupled  problem  structure,  including  agent 
scope  size  (corresponding  to  the  number  of  an  agent's  peers  whose  decisions  influence  the 
agent's decisions), state factor domain size (corresponding to the space of (belief) states that an
agent  must  model  in  order  to  plan  optimal  decisions),  and  degree  of  influence  (corresponding  to 
the  proportion  of  unique  influences  that  peers  can  feasibly  exert).  Studied  separately,  these 
aspects  provide  a  language  for  describing  various  conditional  independencies  that  may  exist 
among  the  plans  of  a  group  of  agents.  Together,  these  three  aspects  define  a  three-dimensional 
landscape  that  can  be  used  to  quantify  the  advantage  gained  through  exploiting  a  problem's 
interaction structure and, ultimately, to predict the amount of computation needed to solve the
problem [WD2011]. For agents that model their world using a Decentralized POMDP (Dec-POMDP),
a bound on the worst-case computational complexity of optimal planning can be
derived as follows:


[Equation not recoverable here; its three annotated terms are the number of best responses, the
overhead of enumerating influences, and the complexity of a best response.]

where n is the number of agents. Weakly-coupled agents will tend to have a smaller state factor
scope size (fewer features of each other’s states that they can affect), a smaller degree of
influence (fewer different ways that their planned actions can affect each other), and a smaller
agent scope size (fewer agents that can affect any particular agent). More importantly, however,
capturing these aspects in a single formula helps explain and predict performance in situations
where agents are coupled to different degrees along the different dimensions.

To emphasize weakly-coupled structure, the research conducted in this project introduced a
transition-dependent decentralized POMDP model that efficiently decomposes into local
decision  models  with  shared  state  features.  For  instance,  features  that  are  controlled  by  one 
agent,  but  that  directly  affect  the  consequences  of  another  agent's  actions,  such  as  happens 
between  agents  requesting  and  providing  a  particular  service  application,  are  included  in  both 
agents'  models.  From  the  perspective  of  the  affected  agent,  these  are  referred  to  as  nonlocally- 
controlled  features.  In  essence,  the  conventionally-specified  Dec-POMDP  model  has  been 
decoupled  into  a  set  of  local  POMDPs  tied  to  one  another  by  the  transition-dependence  among 
their  nonlocally-controlled  features;  this  new  model  is  thus  referred  to  as  a  Transition-Decoupled 
POMDP  (TD-POMDP).  In  comparison  with  related  models,  the  TD-POMDP  achieves  an  effective 
balance in its articulation of exploitable structure and its loss of generality [WD2010a, WD2010b,
WD2010c, W2011].

With  the  TD-POMDP  model  structure  thus  defined,  interagent  influence  may  be  characterized 
quite  simply  as  the  expected  transition  probabilities  of  nonlocally-controlled  features.  Since  these 
probabilities  are  the  only  components  of  an  agent's  local  model  that  may  vary  with  the  behavior  of 
its  peers,  entire  peer  policies  can  be  abstractly  summarized  by  the  influences  they  entail,  and  the 
corresponding  probabilities  incorporated  into  the  transition  model  of  a  single-agent  POMDP  that 
serves as the agent’s local decision model. The transition probabilities associated with a
particular influence can be encoded with a probability distribution Pr(n_j | f_1, f_2, ...), where n_j are new
values of nonlocally-controlled features conditioned on previous values of various state features
f_1, f_2, etc. This work has proven that, for any TD-POMDP, the influences for the system of agents
can  be  jointly  specified  with  a  Dynamic  Bayesian  Network  (DBN)  containing  only  (variables 
representing  the  past  and  present  values  of)  shared  state  features.  This  project’s  investigators 
have  also  proven  that  this  influence  representation  is  sufficient  for  optimal  coordination,  through 
the  use  of  an  influence  space  search  methodology. 
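
As a minimal sketch of how an influence can be folded into a single agent's local model, the Python below treats the influence as a sequence of delivery probabilities and solves a tiny finite-horizon decision problem for a service requester. The function name, the reward values, and the "wait or fall back" action set are illustrative assumptions, not part of the TD-POMDP formulation in the cited work.

```python
from functools import lru_cache

def local_value(influence, fallback=3.0, reward=10.0):
    """Optimal expected reward for a requester, given only the influence.

    influence[t] is an assumed Pr(delivered at t+1 | not delivered at t);
    the horizon is len(influence).
    """
    horizon = len(influence)

    @lru_cache(maxsize=None)
    def v(t, delivered):
        if delivered:
            return reward            # result arrived in time
        if t == horizon:
            return 0.0               # deadline passed with nothing in hand
        p = influence[t]
        wait = p * v(t + 1, True) + (1 - p) * v(t + 1, False)
        return max(fallback, wait)   # give up now, or keep waiting

    return v(0, False)

print(local_value((0, 0, 1, 0)))     # provider commits to deliver at step 3
print(local_value((0.5, 0, 0, 0)))   # provider delivers early half the time
```

The requester never sees the provider's policy here. Any two provider policies that induce the same sequence of probabilities lead to the same local model and therefore the same best response.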

In  essence,  the  joint  policy  formulation  problem  has  been  decoupled  into  the  subproblems  of  (1) 
proposing  influences,  (2)  evaluating  influences,  and  (3)  computing  optimal  policies  around 
influences.  A  mixed-integer  linear  programming  (MILP)  methodology  has  been  developed  for 
solving  each  of  these  subproblems.  In  contrast  with  prior  approaches  geared  towards  enforcing 
interacting  behavior,  this  novel  methodology  enables  an  agent  to  determine  whether  a  desired 
influence  is  feasible,  if  so  to  compute  the  optimal  local  policy  that  is  constrained  to  exert  the 
influence,  and  to  completely  avoid  any  tuning  of  parameters  associated  with  influence 
enforcement  [WD2010a,  WD2010b,  W2011]. 


The  primary  advantage  of  working  in  the  influence  space  is  that  there  are  potentially  significantly 
fewer  feasible  influences  than  there  are  policies.  Blending  prior  work  on  decoupled  joint  policy 
search  and  constraint  optimization,  the  investigators  have  developed  influence-space  search 
algorithms  that,  for  problems  with  a  low  degree  of  influence,  compute  optimal  solutions  orders  of 
magnitude  faster  than  policy-space  search.  When  agents'  influences  are  constrained,  influence- 
space  search  also  outperforms  other  state-of-the-art  optimal  solution  algorithms. 
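
To illustrate why the influence space can be much smaller than the policy space, the following sketch enumerates every ordering of a hypothetical provider's tasks (a stand-in for its deterministic policies) and groups them by the only thing the requester cares about: when the requested service completes. The task names and durations are invented for illustration.

```python
from itertools import permutations

# Hypothetical provider with four tasks (name -> duration); only "svc"
# matters to the requester.
durations = {"svc": 2, "a": 1, "b": 1, "c": 3}

def completion_time(order, target="svc"):
    """Time at which the target task finishes under a given task ordering."""
    t = 0
    for task in order:
        t += durations[task]
        if task == target:
            return t

policies = list(permutations(durations))
influences = {completion_time(order) for order in policies}

print(len(policies), "policies,", len(influences), "distinct influences")
```

In this toy case 24 orderings collapse to 6 distinct completion times, so a search over influences has far fewer candidates to consider than a search over policies.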

The  graphs  below  are  examples  from  a  rigorous  empirical  comparison  of  the  optimal  influence- 
space  search  (OIS)  algorithm  against  three  other  state-of-the-art  optimal  solution  algorithms 
tailored  for  specialized  Dec-POMDP  problems.  As  demonstrated,  when  agents'  windows  of 
interaction  are  small,  indicating  that  agents'  influences  are  most  heavily  constrained,  influence 
abstraction  is  able  to  exploit  this  structure  to  gain  exponential  advantage  (on  average)  over  each 
of  the  other  algorithms,  which  are  less  effective  at  exploiting  this  form  of  weakly-coupled  structure 
[WD2010a,  WD2010b]. 


Moreover, by exploiting both degree of influence and agent scope size in a bucket-elimination
variation [W2011] of the OIS algorithm (labeled be-ois in the graph below, compared to a
depth-first OIS implementation), the investigators have demonstrated scalability, substantially beyond
the reach of prior optimal methods, to teams of 50 weakly-coupled transition-dependent agents,
as shown below.


[Figure: runtime of be-ois compared with depth-first OIS]


Extensions 

During the course of this project, two Master’s students have done research that has extended the
ideas described above to increase the range of problems that can be tackled. One student, Anna
Chen, has investigated how the concept of “influence”, and especially the notion that agents
make commitments about how they will influence each other, extends to problems where the
agents might be uncertain, at the outset of execution, about which states of the world are the most
important to reach. Such problems arise in planning and scheduling services, because in
the midst of executing a planned set of services, new (high-priority) service requests might arrive.

Her work [CDSW2011] has characterized the conditions under which agents can still maximize
their  expected  joint  performance  by  optimizing  based  on  the  mean-reward,  rather  than  having  to 
consider  every  possible  reward  function  separately.  It  has  also  developed  an  iterative  greedy 
algorithm  for  circumstances  where  mean-reward  is  not  optimal,  and  has  empirically  demonstrated 
that  the  algorithm  can  run  considerably  faster  than  does  reasoning  about  all  possible  reward 
functions,  with  only  modest  sacrifices  in  quality.  Intuitively,  the  idea  is  that  if  a  service  provider 
“hedges”  its  commitments  to  retain  some  degree  of  local  slack,  then  in  expectation  the  joint 
performance  can  improve  because  it  can  be  responsive  to  emergent  opportunities.  Beyond  this 
intuition,  though,  is  the  question  of  “how  much”  slack  is  the  right  amount  to  introduce,  and  the 
work  provides  some  preliminary  answers  to  that  question. 
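
One simple condition under which planning against the mean reward loses nothing is when the policy cannot adapt to which reward function is realized: by linearity of expectation, the expected value of a fixed policy over reward functions equals its value under the mean reward. The sketch below checks this numerically. The state-visit probabilities and reward functions are made-up numbers, and this is only an illustration of that one condition, not the characterization given in the cited work.

```python
# Hypothetical state-visit probabilities induced by one fixed policy.
visit = {"s1": 0.6, "s2": 0.3, "s3": 0.1}

# Two candidate reward functions, each paired with its probability.
rewards = [({"s1": 0, "s2": 10, "s3": 4}, 0.25),
           ({"s1": 8, "s2": 0, "s3": 4}, 0.75)]

def value(reward):
    """Expected reward of the fixed policy under one reward function."""
    return sum(visit[s] * reward[s] for s in visit)

# Average the policy's value over reward functions ...
expected_of_values = sum(p * value(r) for r, p in rewards)
# ... versus evaluating the policy once against the mean reward.
mean_reward = {s: sum(p * r[s] for r, p in rewards) for s in visit}
value_of_mean = value(mean_reward)

print(expected_of_values, value_of_mean)
```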

The  investigators’  work  on  improving  tractability  of  finding  joint  policies  for  service  provision  has 
also  identified  some  promising  approximation  techniques.  They  showed  that  approximately 
optimal  commitments  for  service  provision  could  be  computed  in  a  greedy  manner  as  a  form  of 
distributed binary search [WD2008, WD2009a], or by constraining the number of time points
considered by the search [WD2009b, WD2009c].

They have also, in the Master’s project of Jason Sleight, analyzed the degree to which
approximating durational uncertainty can be effective [S2011]. Specifically, that work has
emphasized reducing the branching factor of possible future trajectories by reducing the number
of different possible durations that a service might take. That work has
proven  that  the  search  for  a  duration  approximation  should  never  consider  approximate  durations 
that  do  not  correspond  to  actual  possible  durations,  and  that  inexpensive  error  metrics  can 
fruitfully  act  as  proxies  for  the  expected  loss  of  utility  that  such  approximations  will  incur.  It  has 
also  developed  a  polynomial  time  (dynamic  programming)  algorithm  for  quickly  determining  the 
appropriate  approximation. 
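
The following is a minimal sketch of a polynomial-time dynamic program for choosing K representative durations, restricted (as the result above requires) to durations that can actually occur. The error metric used here, expected absolute deviation from the chosen representative, is an assumed stand-in for the inexpensive proxies mentioned above, and the function names are hypothetical.

```python
def group_cost(d, p, i, j):
    """Cheapest way to represent d[i..j] by one actual duration in that range."""
    return min(sum(p[k] * abs(d[k] - d[r]) for k in range(i, j + 1))
               for r in range(i, j + 1))

def approximate(d, p, K):
    """Minimum expected error when sorted durations d (with probabilities p)
    are split into K contiguous groups, each replaced by one of its members."""
    n = len(d)
    INF = float("inf")
    # best[k][j] = minimum cost of covering d[0..j-1] with exactly k groups
    best = [[INF] * (n + 1) for _ in range(K + 1)]
    best[0][0] = 0.0
    for k in range(1, K + 1):
        for j in range(1, n + 1):
            for i in range(k - 1, j):          # last group is d[i..j-1]
                c = best[k - 1][i] + group_cost(d, p, i, j - 1)
                if c < best[k][j]:
                    best[k][j] = c
    return best[K][n]

durations = [1, 2, 3, 10]
probs = [0.25, 0.25, 0.25, 0.25]
for K in range(1, 5):
    print(K, approximate(durations, probs, K))
```

The error shrinks as K grows and reaches zero when every actual duration is kept, which is the kind of cost-versus-fidelity profile a developer would weigh against computation time.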

In  addition,  that  work  has  demonstrated  the  importance  of  characterizing  the  execution-time 
behavior  of  an  agent  when  determining  the  “best”  approximation,  in  terms  of  what  the  agent  does 
when  the  actual  execution  of  a  service  takes  a  duration  that  was  excluded  from  the  approximation. 
The  work  has  shown  that  for  more  passive  execution-time  responses,  such  as  having  the  agent 
idle  unless/until  the  natural  evolution  of  the  world  leads  back  to  an  expected  state,  a  simple 
parameterization  of  the  dynamic  programming  algorithm  can  formulate  an  approximation  that  can 
better account for this execution-time behavior. The graph below (left) shows how different
settings of the parameter (alpha) affect the loss of expected value as the number (K) of durations
retained from the set of actual discrete durations is increased from 1 to all 10, and shows the algorithm’s
computation time as K grows (right). These types of performance-cost profiles give a system
developer the information needed to strike the right tradeoff for a particular application.


2.2  Distributed  Constraint  Optimization  for  Service  Assignment 

A  danger  in  conducting  temporal  planning  for  service  composition  arises  when  the  number  of 
requests  for  a  particular  service  could  outstrip  the  capabilities  of  the  provider  of  that  service. 
Faced  with  such  an  “over-constrained”  problem,  a  temporal  planner  can  spend  considerable  time 
searching  for  a  fully  satisfying  plan  by  trying  different  orderings  and  timings  of  service  provisions, 
when in fact the problem is inherently unsolvable. To address this case, the Master’s degree
project of Christopher Portway developed a preprocessing step for efficiently detecting
over-constrained situations, and for (heuristically) identifying an approximately optimal subset of
requests that could be collectively achieved with the provider’s limited resources.

The  core  idea  behind  the  approach  has  been  to  cast  the  problem  of  deciding  on  service  requests 
to  fulfill  as  a  distributed  constraint  optimization  problem  (DCOP).  A  variety  of  algorithms  for 
solving DCOPs have been developed in the past; because the objective in this case is to optimize
the value of the service requests satisfied within the providers’ time constraints,
the approach adopted in this project has built on the multiply-constrained (MC-)DCOP
framework  developed  by  other  researchers.  However,  that  framework  required  significant 
augmentation  to  support  modeling  single  agents  each  reasoning  over  multiple  variables 
(requests),  leading  to  the  development  of  a  new  Multiple  Variable  (MV-)MC-DCOP  [PD2009]. 
Further  improvements  to  solution  quality  can  be  gained  by  including  in  the  formulation  the 
temporal  (precedence)  constraints  that  occur  when  services  sequentially  chain  their  contributions 
to  yield  an  overall  response  to  a  user’s  needs,  leading  to  the  Ordered  (O-)MV-MC-DCOP 
[PD2010a,  PD2010b]. 
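
To make the request-selection formulation concrete, the sketch below encodes a toy instance (requests with values and durations, providers with limited time) and runs a simple maximum-gain local search, in which a request changes its accept/reject decision only when its gain beats that of every neighbor sharing its provider. This is a generic illustration of that style of DCOP local search under assumed data. It is not the MV-MC-DCOP algorithm from the cited work, and it omits ordering constraints.

```python
# Hypothetical requests: name -> (provider, duration, value).
requests = {"r1": ("P", 4, 5), "r2": ("P", 3, 4),
            "r3": ("P", 3, 4), "r4": ("Q", 2, 1)}
capacity = {"P": 6, "Q": 5}    # time each provider can offer

def load(assign, provider):
    """Total duration of accepted requests at one provider."""
    return sum(requests[r][1] for r in assign
               if assign[r] and requests[r][0] == provider)

def gain(assign, r):
    """Change in total value if request r alone flips its accept/reject bit."""
    prov, dur, val = requests[r]
    if assign[r]:
        return -val
    fits = load(assign, prov) + dur <= capacity[prov]
    return val if fits else float("-inf")

def max_gain_search():
    assign = {r: False for r in requests}      # start with nothing accepted
    while True:
        gains = {r: gain(assign, r) for r in requests}
        movers = []
        for r in requests:
            nbrs = [q for q in requests
                    if q != r and requests[q][0] == requests[r][0]]
            # r moves only if its gain is positive and beats every neighbor
            # (ties broken by name), so neighbors never move in the same round.
            if gains[r] > 0 and all((gains[r], r) > (gains[q], q) for q in nbrs):
                movers.append(r)
        if not movers:
            return assign
        for r in movers:
            assign[r] = not assign[r]

print(max_gain_search())
```

On this instance the search stops after accepting r1 and r4 (total value 6), even though accepting r2, r3, and r4 would be worth 9. That gap is the usual price of this kind of fast local search: the result is feasible but only locally optimal.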

These extended frameworks were implemented in the multiagent system as (O-)MV-MC-MGM,
extending the efficient (but only approximately optimal) MC-MGM DCOP approach, which was
developed by others, to handle multiple variables and ordering constraints. The
service providers and requesters engage in the (O-)MV-MC-MGM protocol to converge on which
subset of services to attempt to schedule, and only once this trimming of the problem has been
done do these agents engage in the more costly search for joint influences and policies as
described in Section 2.1. Empirically, the (O-)MV-MC-MGM preprocessing was applied to larger
problems, and was shown to improve expected reward and to reduce overall
computation time compared to solving the problems using the approximate methods described in
Section 2.1. An example of the results from [PD2010b] is shown below, which also includes the
performance of a centralized mixed-integer linear program (MILP) technique for solving the
MV-MC-DCOP.

2.3 Improved Multiagent Selection and Scheduling Algorithms

Selection  and  scheduling  of  services  (or  activities  more  generally)  are  intertwined  problems.  That 
is,  different  selection  choices  can  introduce  different  temporal  constraints  (e.g.,  “better”  services 
might  take  longer  to  provide),  and  timing  constraints  can  limit  selection  choices  (e.g.,  an 
impending  deadline  rules  out  the  “best”  but  longest  duration  service).  For  service  composition 
problems involving services with more controllable/deterministic durations, the doctoral research
of Jim Boerkoel has formulated the problem as a Hybrid Scheduling Problem (HSP).
An HSP consists of a traditional finite-domain constraint satisfaction problem (CSP), where
service requests are variables, for example, and a service provider value needs to be selected for
each, and a traditional disjunctive temporal problem (DTP), where start and end times of
particular service provisions must be assigned. These problems are coupled by “hybrid
constraints”  that  express  relationships  between  selection  and  scheduling  assignments,  such  as 
that  if  service  provider  A  is  selected  to  satisfy  a  request,  then  the  timing  of  when  the  request  will 
be  satisfied  is  constrained  by  provider  A’s  prior  commitments. 
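
A toy instance can make the coupling between selection and scheduling concrete. In the sketch below, choosing a provider (the finite-domain part) fixes a duration and an earliest start time (the temporal part), and a brute-force enumeration lists the assignments that meet a deadline. The provider data, integer time grid, and function name are illustrative assumptions. Real HSP solvers rely on constraint propagation rather than enumeration.

```python
from itertools import product

# Hypothetical providers: name -> (service duration, earliest free time, quality).
providers = {"A": (5, 2, 3), "B": (2, 0, 1)}
horizon = range(0, 10)         # candidate integer start times

def solutions(deadline):
    """Yield (provider, start, end, quality) tuples satisfying all constraints."""
    for prov, start in product(providers, horizon):
        dur, free_at, quality = providers[prov]
        # Hybrid constraint: the selected provider fixes the duration and the
        # earliest start allowed by its prior commitments.
        if start >= free_at and start + dur <= deadline:
            yield prov, start, start + dur, quality

print(sorted({s[0] for s in solutions(deadline=8)}))   # both providers feasible
print(sorted({s[0] for s in solutions(deadline=6)}))   # tight deadline
```

With the tighter deadline the higher-quality but slower provider A drops out entirely, mirroring the way a timing constraint can limit selection choices.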

The research conducted in this project has developed techniques for automatically deriving and
expressing critical implied constraints in HSPs that in turn enable state-of-the-art constraint
propagation techniques to rapidly converge on satisfying solutions. The investigators have also
developed  the  first  partially  and  fully  decentralized  triangulation-based  algorithms  for  solving 
multiagent  temporal  constraint  problems,  and  demonstrated  their  efficacy.  These  two  thrusts  are 
summarized  below. 

Hybrid  Constraint  Tightening 

Hybrid  Constraint  Tightening  (HCT)  is  an  algorithm  for  preprocessing  an  HSP  formulation, 
applying  constraint  compilation  principles  to  reformulate  hybrid  constraints  by  lifting  information 
from  the  structure  of  an  HSP  instance  [BD2008].  These  reformulated  constraints  elucidate 
implied  constraints  between  the  CSP  and  DTP  subproblems  of  an  HSP  earlier  in  the  search 
process  and  can  lead  to  significant  search  space  pruning.  Despite  the  computational  costs 
associated  with  applying  the  HCT  preprocessing  algorithm,  HCT  leads  to  orders  of  magnitude 
speedup  when  used  in  conjunction  with  off-the-shelf,  state-of-the-art  solvers,  as  compared  to 
solving the same problem instance without applying HCT. The investigators have conducted a
systematic exploration of the properties of HSPs that influence HCT's efficacy, and have
quantified empirically the conditions under which HCT is particularly effective [BD2009].

Multiagent  Algorithms  for  Temporal  Reasoning  and  Decoupling 

Activities  such  as  satisfying  service  requests  inherently  link  the  schedules  of  different  agents 
together  by  imposing  interagent  temporal  constraints,  such  as  when  a  service  requester  must 
wait  to  proceed  on  its  next  task  until  after  the  service  provider  has  returned  the  requested  result. 
Each  agent  also  has  intra-agent  (internal)  temporal  constraints,  among  which  is  the  constraint 
that  if  the  agent  is  performing  a  task,  then  the  start  time  of  any  other  task  it  plans  to  do  must  be 
no sooner than the end time of the task it is performing. As part of this project, the investigators
have formally defined the multiagent simple temporal problem (MaSTP) as a multiagent
extension of the traditional STP representation that also accounts for constraints between agents’
schedules. The definition specifies which portions of each agent’s STP are private to that agent,
and which portions of other agents’ STPs it must necessarily be aware of [BD2010a, BD2010b].
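
For readers unfamiliar with the underlying STP representation, the sketch below shows the standard centralized way to check consistency and tighten bounds: encode each constraint lo <= t_j - t_i <= hi as two edges of a distance graph and run Floyd-Warshall. The four timepoints and their bounds are a made-up provider/requester example. This is the textbook centralized baseline, not the distributed algorithm developed in this project.

```python
INF = float("inf")

def minimal_network(n, constraints):
    """constraints: list of (i, j, lo, hi) meaning lo <= t_j - t_i <= hi.
    Returns the all-pairs shortest-path matrix, or None if inconsistent."""
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for i, j, lo, hi in constraints:
        d[i][j] = min(d[i][j], hi)       # t_j - t_i <= hi
        d[j][i] = min(d[j][i], -lo)      # t_i - t_j <= -lo
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    if any(d[i][i] < 0 for i in range(n)):
        return None                      # negative cycle: no schedule exists
    return d

# Timepoints: 0 = reference, 1 = provider start, 2 = provider end,
# 3 = start of the requester's next task.
stp = [(0, 1, 0, 10),    # provider may start any time in [0, 10]
       (1, 2, 3, 5),     # service takes between 3 and 5 time units
       (2, 3, 0, INF),   # interagent: requester waits for the result
       (0, 3, 0, 6)]     # requester must start its next task by time 6
d = minimal_network(4, stp)
print("provider must start by", d[0][1], "and finish by", d[0][2])
```

The interagent constraint propagates the requester's deadline back into the provider's schedule: the provider's latest start tightens from 10 to 3.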

Prior research on solving problems that can be encoded in the MaSTP formulation has instead
modeled the problem in a single, centralized way, and thus not only fails to scale well to larger
multiagent systems, but also requires full revelation of private scheduling information. Prior
solution algorithms thus require centralizing the problem representation at some "coordinator"
that calculates a (set of) solution schedule(s) for all. Such algorithms can incur unacceptable
computational, communication, and privacy costs for problems like service composition, where
problems are inherently distributed and where unnecessarily propagating awareness of which
services are making requests of which other services might be risky. That is, agents that specify
problems in a distributed fashion might reasonably expect some degree of privacy, and demand
greater latitude for rapid scheduling changes than can be supported in a centralized system.

This  project  has  developed  new,  distributed  algorithms  for  finding  and  maintaining  a  (set  of) 
solution(s)  for  the  MaSTP.  The  investigators  have  proven  the  correctness,  privacy  implications, 
and  runtime  properties  of  each  of  these  algorithms.  They  have  also  empirically  evaluated  the 
algorithms’  costs  in  terms  of  both  time  and  communication. 

The high-level approach is to decompose problems into n locally-independent subproblems that
each of the n agents can solve concurrently and privately, and one shared subproblem that
the agents must work together to solve. This decomposition divides variables based on whether
or  not  they  are  involved  in  external  constraints  and  so  exploits  the  natural,  loosely-coupled 
problem  structure  of  many  real-world  problems,  such  as  service  composition.  This  partitioning  is 
described  in  detail  for  the  MaSTP  in  [BD2010a],  and  is  used  to  prove  important  privacy  properties 
for  the  algorithms  and  to  empirically  demonstrate  how  this  structure  influences  algorithm 
performance.  Solving  the  MaSTP  is  an  important  precursor  for  solving  the  multiagent  versions  of 
more  complicated  scheduling  formulations  such  as  DTPs  and  HSPs,  which  often  require  quickly 
evaluating  partial,  candidate  assignments  in  the  form  of  component  STPs. 
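
The partitioning of variables can be sketched in a few lines: a timepoint is shared if it appears in any constraint whose endpoints belong to different agents, and private otherwise. The timepoint names and the `owner` mapping below are hypothetical.

```python
def partition(owner, constraints):
    """owner: timepoint -> agent; constraints: iterable of (u, v) pairs.
    Returns (private, shared) dicts mapping each agent to a set of timepoints."""
    external = set()
    for u, v in constraints:
        if owner[u] != owner[v]:         # constraint spans two agents
            external.update((u, v))
    private = {a: set() for a in owner.values()}
    shared = {a: set() for a in owner.values()}
    for tp, a in owner.items():
        (shared if tp in external else private)[a].add(tp)
    return private, shared

owner = {"p_start": "provider", "p_end": "provider",
         "r_prep": "requester", "r_use": "requester"}
constraints = [("p_start", "p_end"),     # provider's internal duration
               ("r_prep", "r_use"),      # requester's internal ordering
               ("p_end", "r_use")]       # interagent: result before use
private, shared = partition(owner, constraints)
print(private)
print(shared)
```

Only `p_end` and `r_use` need to be visible across agents. Each agent can reason about its remaining timepoints without revealing them.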

The  algorithm  to  solve  the  MaSTP  employs  a  variable  elimination  procedure,  where  each  agent 
privately  and  concurrently  eliminates  its  private  local  variables  first,  and  then  coordinates  with 
other  agents  to  eliminate  its  externally  constrained  (shared)  variables.  The  algorithm  then 
performs  a  reverse  pass  that  calculates  the  full  set  of  possible  joint  solutions.  This  distributed 
algorithm  demonstrates  impressive  levels  of  speedup  over  comparable  centralized  approaches, 
especially  on  weakly-coupled  problems.  The  graph  below  compares  the  MaSTP  algorithm  where 
agents  process  their  private  problems  concurrently,  with  a  variation  that  processes  the  private 
problems  sequentially.  These  are  both  compared  to  solving  the  problem  in  the  standard 
centralized way. As the proportion of interagent to intra-agent constraints is varied from low (left)
to high (right), the MaSTP algorithms display speedups of between two and five orders of magnitude.

The resulting solution to the MaSTP provides agents with the maximal amount of flexibility
permitted  by  their  constrained  problems,  and  can  support  the  further  refinement  of  their 
schedules to converge on coordinated activities. However, this further refinement requires that,
each time an agent assigns one of its timepoint variables, all other agents wait until that
decision has been propagated across the network before making decisions of their own.
Proceeding without waiting for such propagation to complete risks agents making incompatible
assignments that render the joint scheduling problem unsolvable.


An  alternative  that  has  been  developed  in  this  project  is  to  allow  agents  to  heuristically  decouple 
themselves  from  the  subproblems  of  other  agents.  Informally,  a  decoupling  (e.g.,  a  temporal 
decoupling)  is  defined  in  terms  of  locally  independent  sets  of  solutions  that,  when  combined,  form 
a solution to the original multiagent constraint problem. Thus, the goal for each agent is to make
search decisions (e.g., impose new intra-agent constraints) that render interagent (external)
constraints moot. For example, if a service requester and provider agree on a time at which the
service provision must finish, then they can each incorporate this more precise constraint into
their local models and know implicitly that their interagent constraint will be satisfied.
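
The requester/provider example can be checked directly. In the sketch below, the interagent constraint is that the provider's end time P must not exceed the requester's start time R. Adding the purely local constraints P <= 6 and R >= 6 (the agreed time 6 is an arbitrary illustrative choice) makes every pair of independently chosen local solutions a valid joint solution.

```python
from itertools import product

times = range(0, 11)           # integer timepoints 0..10
agreed = 6                     # hypothetical agreed hand-off time

# New local constraints that replace the interagent constraint P <= R.
provider_local = [p for p in times if p <= agreed]
requester_local = [r for r in times if r >= agreed]

# Every combination of independently chosen local solutions is a joint solution.
decoupled_ok = all(p <= r for p, r in product(provider_local, requester_local))
# Without the added constraints, independent choices can clash.
coupled_ok = all(p <= r for p, r in product(times, times))

print(decoupled_ok, coupled_ok)
```

The price of decoupling is lost flexibility: joint solutions such as P = 7, R = 9 satisfy the original constraint but are excluded by the local ones, which is why the tightness of the added constraints matters.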

The  investigators  have  augmented  the  MaSTP  algorithm  with  this  ability  to  heuristically  decouple 
agents’  problems,  resulting  in  a  solution  to  the  multiagent  temporal  decoupling  problem  (MaTDP) 
[BD2011]. Specifically, during the reverse phase of MaSTP execution, agents introduce new
constraints  into  their  shared  STP  that  temporally  decouple  their  local  subproblems.  Once  all 
subproblems  are  decoupled,  the  reverse  phase  continues  into  the  agents’  private  problems. 
Finally,  the  investigators  have  also  developed  an  algorithm  that  makes  a  forward  pass  through 
the  shared  problem  after  the  decoupling  to  relax  unnecessarily  tight  constraints  before  completing 
the  reverse  pass  through  the  agents’  private  problems. 

In the graph below, the computational time for solving the MaTDP plus relaxation (MaTDP+R) is shown to be
several orders of magnitude less than that of the previous best algorithm for solving the TDP (labeled
TDP), which did so in a centralized way. A centralized (sequential rather than concurrent) version
of the MaTDP algorithm is also tested, as is the MaSTP algorithm.

Further, using the traditional measure of performance for decoupling, which measures the rigidity
of a decoupled schedule (lower rigidity is better), the
MaTDP+R technique performs nearly as well as the state-of-the-art centralized TDP algorithm,
despite the MaTDP+R heuristic being much simpler and being computed in a distributed manner.
In the table below, the minimal rigidity for the non-decoupled case is compared to the rigidity for TDP
and MaTDP+R at levels of external constraints (from the graph above) of N = 50, 200, and 800.


Algorithm        N=50     N=200    N=800
No Decoupling    0.418    0.549    0.729
TDP              0.482    0.668    0.865
MaTDP+R          0.496    0.699    0.886
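
To quantify "nearly as well", the short calculation below computes the relative increase in rigidity of MaTDP+R over TDP from the reported values. The dictionary layout and variable names are just one convenient way to hold the table.

```python
# Rigidity values as reported in the table, keyed by number of external constraints.
rigidity = {
    "TDP":     {50: 0.482, 200: 0.668, 800: 0.865},
    "MaTDP+R": {50: 0.496, 200: 0.699, 800: 0.886},
}

# Relative increase in rigidity of the distributed heuristic over centralized TDP.
overhead = {n: rigidity["MaTDP+R"][n] / rigidity["TDP"][n] - 1
            for n in rigidity["TDP"]}

for n, extra in overhead.items():
    print(f"N={n}: {extra:.1%} more rigid than TDP")
```

For these three data points the distributed heuristic's rigidity stays within 5% of the centralized algorithm's.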

3.  Summary 

This  project  developed  and  evaluated  theoretically  sound  and  practically  relevant  techniques  for 
multiagent  sequential  decision-making,  distributed  constraint  reasoning,  and  multiagent 
scheduling  and  planning.  These  results  are  applicable  to  solving  problems  of  temporal  planning 
for  service  composition,  and  have  broader  promise  for  other  applications,  such  as  for  finding  and 
scheduling human experts for collaboration to solve complex problems [DBS11].